1.  Summary

This  STIR  project  focuses  on  the  development  of  3D  shape  matching  and  recognition 
techniques  specially  targeted  for  3D  modeling.  The  ultimate  objective  of  our  research  is 
to  develop  effective  methods  for  rapid  creation  of  large-scale  3D  models  from  3D 
geospatial  sensor  data,  and  specifically  LiDAR  point  clouds.  Our  technical  approach  is  a 
novel  alternative  to  traditional  modeling  approaches.  The  novelty  arises  from  using  the 
strategy  of  Modeling  by  Recognition  (MBR)  to  rapidly  identify  objects  from  a  3D 
library  of  objects  within  point-cloud  data.  The  recognized-object  point  clouds  are  then 
replaced  with  library  data,  such  as  polygon  surface  models,  thereby  constructing  accurate 
and  complete  3D  scene  models. 

Our  research  foci  are  the  key  components  in  the  proposed  modeling  approach:  the  3D 
shape  matching  algorithms  that  are  used  to  detect  objects  of  interest  from  point-cloud 
inputs  and  match  them  to  model-library  elements. 

We  pursued  shape  matching  algorithms  in  two  ways:  (1)  Global  primitive  analysis  for 
automatically  detecting  and  extracting  primitive  shape  geometry  such  as  planes,  cylinders 
and  cuboids  from  point-cloud  data,  and  (2)  Local-feature  techniques  for  representation  of 
point-cloud  features  to  produce  unique  3D  geometric  descriptions  for  general  3D  shape 
matching. We designed and implemented both approaches and evaluated them extensively
on a variety of datasets. Ultimately, these methods can become the core of a unified
framework that automatically matches point-cloud data to a library of model components,
supporting both 3D model creation and object recognition/labeling.

The  STIR  was  pursued  to  flesh  out  some  initial  ideas  in  an  earlier  proposal.  Reviewers 
expressed  concerns  about  several  issues  that  we  addressed  over  the  STIR  term.  These 
issues  are  summarized  below. 

Our  initial  proposal  provided  little  detail  about  the  efficiency  of  our  proposed  3D 
matching  methods.  There  was  skepticism  about  the  feasibility  of  doing  3D  matching  in  a 
manner  that  is  robust  to  noise  and  sparse  data.  Some  specific  methods  and  initial  results 
were  needed  to  claim  feasibility  and  focus  the  effort. 

We believe that we have addressed these issues over the past months. We have developed
and  evaluated  initial  algorithms.  We  show  results  for  real  world  data  examples  and 
provide  measured  performance.  We  believe  the  progress  we  made  and  the  results  of  that 
effort  are  encouraging  and  make  a  compelling  case  that  Modeling  by  Recognition 
(MBR)  is  in  fact  feasible  and  attractive  as  a  new  approach  to  the  scene  modeling  and 
object  labeling  problems. 


2.  Description  of  research  achievements 

2.1  Techniques  for  detecting  and  modeling  primitive  shapes

We  focused  on  developing  algorithms  for  automatic  detection  and  extraction  of  primitive 
shapes  and  features  such  as  planes,  cylinders  and  cuboids  from  point-cloud  data.  We 
observe that most man-made objects in urban scenes are composed of primitives that
often exhibit planarity and regularity. These objects should be detected first and
fitted to the raw data to form a simple, clean representation of the scene. The
primitive shapes also serve as backbones for the rest of the modeling process: they
provide additional constraints on the search for associated objects and propagate
spatial relationships. For example, in aerial LiDAR of urban scenes, roofs and ground
are often planar. In street-level LiDAR, facades and ground are often planar.
Attempting  to  match  planar  surfaces  is  unlikely  to  succeed  due  to  the  lack  of 
discriminating  local  features.  Instead,  our  strategy  is  to  extract  the  planar  and  cylinder 
shapes  first  using  their  global  geometric  properties.  Modeling  can  proceed  directly  by 
replacing  their  point  clouds  with  geometric  surfaces.  Labeling  will  require  further 
analysis, which is beyond our scope in the STIR. However, we speculate that recognition
based on a high-level description of planes and cylinders may be more tractable and
successful than recognition based on low-level point clouds.

An  effective  method  for  automatically  detecting  and  modeling  primitive  shapes  from 
point  clouds  is  based  on  the  concepts  of  the  Gaussian  sphere  and  global  analysis  derived 
from differential geometry. We employed two forms of global information, surface
normals and axis orientation, to measure the regularity of a surface. We adopt a
statistical model based on normal analysis, which detects this global information automatically.
Normals  are  computed  for  each  data  point  using  covariance  analysis.  The  data  points  are 
converted  to  Hermite  form  or  “oriented  points”,  that  is,  3D  point  positions  with  a  normal 
vector.  The  Hermite  points  are  then  projected  onto  a  Normal  sphere  (Gaussian  sphere), 
which  establishes  distinctive  patterns  for  feature  detection  and  modeling  (Figure  1). 
Circles indicate cylinders, and point clusters indicate planar surfaces. Figure 1
shows an example of a sphere map. The rings are distinctive signatures of Hermite scan
data from cylinders of varied orientation. Color spots indicate aligned surface normals
of planar surfaces. We process scan volumes of varied sizes to produce these spheres and
then segment the scan points that fall on a ring or at a spot. The sphere patterns are
thus a representation of global object shape.

Figure 1: Illustration of plane detection: (a) Points on parallel planes have similar
normals; (b) Normals are projected onto a Gaussian sphere and clusters are detected;
(c) Points are segmented based on locality of points within clusters.
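
The normal-estimation and Gaussian-sphere steps described above can be sketched as
follows. This is a minimal illustration under stated assumptions, not the implementation
used in this project: the function names, the neighborhood size k, the angular
threshold, and the greedy clustering rule are hypothetical choices, and the brute-force
neighbor search only suits small clouds. Ring (cylinder) detection and the spatial
separation of parallel planes are not shown.

```python
import numpy as np


def estimate_normals(points, k=12):
    """Estimate one unit normal per point from the covariance of its k nearest neighbors."""
    points = np.asarray(points, dtype=float)
    # Brute-force pairwise squared distances (illustrative; use a k-d tree for real data).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    neighbor_idx = np.argsort(d2, axis=1)[:, :k]
    normals = np.empty_like(points)
    for i, idx in enumerate(neighbor_idx):
        cov = np.cov(points[idx].T)  # 3x3 covariance of the neighborhood
        # eigh returns eigenvalues in ascending order, so column 0 is the
        # direction of least variance, taken here as the surface normal.
        _, eigvecs = np.linalg.eigh(cov)
        normals[i] = eigvecs[:, 0]
    return normals


def cluster_on_gaussian_sphere(normals, angle_deg=10.0, min_size=20):
    """Greedily group normal directions; returns one label per point (-1 = unclustered).

    Unit normals are already points on the Gaussian sphere, so clustering them
    by angle groups points that lie on planes of the same orientation.
    """
    cos_thresh = np.cos(np.radians(angle_deg))
    labels = np.full(len(normals), -1)
    next_label = 0
    for i in range(len(normals)):
        if labels[i] != -1:
            continue
        # abs() treats n and -n as the same direction (the normals are unoriented).
        close = (np.abs(normals @ normals[i]) >= cos_thresh) & (labels == -1)
        if close.sum() >= min_size:
            labels[close] = next_label
            next_label += 1
    return labels
```

Calling `cluster_on_gaussian_sphere(estimate_normals(points))` on a cloud sampled from
two perpendicular planes yields two labels, one per orientation. Points on parallel
planes share a label and would still need to be split by their offset along the normal.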

We  have  tested  the  approach  with  various  point  cloud  datasets  including  ground-level 
LiDAR  scans  of  industrial  settings  and  complex  urban  scenes.  Figure  1  shows  a  result  of 
applying  the  method  to  detect  and  extract  primitive  shapes  in  a  portion  of  an  industrial 
scene.  Figure  2  shows  the  modeled  polygons  for  the  primitives  detected  in  the  complete 
industrial  scene.  Figure  3  shows  the  results  of  applying  the  method  to  ground  and  aerial 
LiDAR of urban scenes (LA downtown area). The original LiDAR data of Figure 3 (a)
and (b) contain 100.1M and 142.9M scanned points, respectively. Our approach detected
and inferred 223 and 408 object clusters as primitive objects. Processing is fully
automatic and took 410.7 s and 482.5 s, respectively (roughly 244,000 and 296,000
points per second).


Figure 2: This industrial site scan model is automatically created from planar and cylinder
primitives  extracted  and  modeled  by  our  algorithms. 




Figure  3:  Detected  primitive  shapes  from  point  cloud  data  captured  by  ground  and  aerial 
LiDAR  sensors  (LA  downtown  area). 


2.2  Techniques  for  feature-based  scene  description  and  shape  matching 

The  second  major  focus  is  on  algorithms  for  representing  point-cloud  features  that  encode 
unique local geometric properties for general 3D shape matching. Specifically, we
developed a novel representation and matching process based on the concept of
self-similarity. Self-similarity is a characteristic property of fractals and
topological geometry. It captures the internal geometric layout of local patterns at a
level of abstraction. In images, locations where a local pattern has self-similar
structure are distinguishable from neighboring locations, which can greatly facilitate
matching across images that appear substantially different at the pixel level [Huai 1].

We  developed  a  unique  3D  self-similarity  feature  descriptor  and  built  a  matching 
framework  based  on  the  descriptor  for  general  shape  matching.  We  define  self-similarity 
as  the  property  that  is  held  by  those  parts  of  data  that  resemble  themselves  in  comparison 
to other parts of the data. The resemblance can be measured on photometric properties,
geometric properties, or a combination of the two. Photometric properties such as color,
intensity or texture are useful for imagery, but they are unlikely to be useful on point
clouds. We therefore turned to geometric properties as the essential information.
Surface normals and curvatures characterize the geometric properties of a local surface,
so we used them as self-similarity measurements to produce 3D feature descriptions for
point matching. Photometric information can also be incorporated in our descriptor and
matching algorithms to generalize the approach.

The  surface  normal  is  an  effective  geometric  property  that  enables  human  visual 
perception  to  distinguish  local  surfaces  or  shapes  in  point  clouds.  Normal  similarity  is 
robust  to  a  wide  range  of  variations  that  occur  within  disparate  object  classes. 
Furthermore, 3D point positions with normal vectors (i.e. Hermite data) form a local
cylindrical coordinate system that provides a view-independent description of a surface.
Figure 3 shows corresponding descriptors based on surface normals for variations of
point clouds obtained for similar surface shapes.


(Rows: point clouds 1, 2 and 3. Columns: original points, surface normals,
self-similarity surface, self-similarity descriptor.)


Figure  3:  Illustration  for  using  surface  normals  to  produce  a  3D  self-similarity  feature 
descriptor.  The  local  internal  layouts  of  self-similarities  are  shared  by  point  cloud  data  with 
different  geometric  variations. 
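
To make the idea of a normal-based self-similarity description concrete, the sketch
below bins the similarity between a key point's normal and its neighbors' normals in a
local cylindrical frame defined by the key point and its normal. It is an illustration
under assumptions, not the descriptor developed in this project: the function name, the
|cos| similarity measure, the (rho, |h|) binning, and the bin counts are all
hypothetical. Unit-length normals are assumed.

```python
import numpy as np


def normal_self_similarity_descriptor(points, normals, center, radius,
                                      n_rho=4, n_h=4):
    """Bin normal similarity around points[center] in (rho, |h|) cylindrical cells."""
    p, n = points[center], normals[center]
    offsets = points - p
    h = offsets @ n                                          # signed height along the key normal
    rho = np.linalg.norm(offsets - np.outer(h, n), axis=1)   # distance from the normal axis
    inside = np.linalg.norm(offsets, axis=1) <= radius
    inside[center] = False                                   # exclude the key point itself

    # Assumed similarity measure: |cos| of the angle between normals, in [0, 1].
    sim = np.abs(normals @ n)

    # Using |h| makes the bins independent of the (ambiguous) sign of the normal.
    rho_bin = np.minimum((rho[inside] / radius * n_rho).astype(int), n_rho - 1)
    h_bin = np.minimum((np.abs(h[inside]) / radius * n_h).astype(int), n_h - 1)

    total = np.zeros((n_rho, n_h))
    count = np.zeros((n_rho, n_h))
    np.add.at(total, (rho_bin, h_bin), sim[inside])
    np.add.at(count, (rho_bin, h_bin), 1.0)
    # Mean similarity per cell; empty cells stay at 0.
    mean_sim = np.divide(total, count, out=np.zeros_like(total), where=count > 0)
    return mean_sim.ravel()
```

Because the cells depend only on distances and angles measured relative to the key
point and its normal, the resulting vector is unchanged when the whole cloud is rotated
or translated, which is the view-independence property discussed above.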


Curvature is another important geometric property that we incorporated in the
similarity measurement. Curvature describes the rate at which the surface tangent
changes. Curved surfaces always have varying normals, yet many common shapes such as
spheres and cylinders have consistent curvature. Since there are many possible
directions of curvature in 3D, we used the direction in which the curvature is
maximized, i.e. the principal curvature direction, to make it unique.
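
One common way to estimate principal curvatures from a point neighborhood is to fit a
quadric height field in the local tangent frame and take the eigenvalues of its Hessian.
The sketch below uses that technique as an illustration; the source does not state how
curvature was computed in this project, and the function name and parameters are
hypothetical. It assumes the surface passes through `center` with the given normal.

```python
import numpy as np


def principal_curvatures(points, center, normal):
    """Estimate the two principal curvatures at `center`, largest magnitude first."""
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Build an orthonormal tangent basis (u, v) perpendicular to the normal.
    helper = np.array([1.0, 0.0, 0.0]) if abs(normal[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(normal, helper)
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)

    # Express the neighborhood in the local frame: (x, y) in the tangent plane,
    # z the height above it.
    local = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    x, y, z = local @ u, local @ v, local @ normal

    # Least-squares fit of z = a*x^2 + b*x*y + c*y^2.
    design = np.column_stack([x * x, x * y, y * y])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    a, b, c = coeffs

    # At the origin of the frame the gradient is zero, so the Hessian eigenvalues
    # are the principal curvatures (sign depends on the normal orientation).
    hessian = np.array([[2 * a, b], [b, 2 * c]])
    k = np.linalg.eigvalsh(hessian)
    return k[np.argsort(-np.abs(k))]
```

For a small patch on a sphere of radius R both returned values have magnitude close to
1/R, while for a cylinder of radius R one is close to 1/R and the other close to zero,
which matches the curvature consistency of these shapes noted above.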

We  extensively  evaluated  our  3D  self-similarity  descriptor  and  matching  process  in  terms 
of robustness, accuracy and speed. Figure 4 shows some results of the quantitative
evaluation with SHREC benchmark datasets, and Table 1 shows the statistics of
performance in terms of matching accuracy, robustness and computation time. Figure 5
shows  the  results  on  LiDAR  point  clouds,  and  Figure  6  shows  the  performance  measures 
in  precision-recall  curves.  More  results  and  technical  details  of  the  method  can  be  found 
in  our  recent  publication  [FIual2].  The  results  show  that  our  3D  self-similarity 
algorithms  efficiently  capture  distinctive  geometric  signatures  embedded  in  point  clouds. 
The  resulting  3D  self-similarity  descriptor  is  compact  and  view/scale-independent,  and 
hence  can  produce  highly  efficient  feature  representation  and  matching  of  point  clouds. 


(Panels: rotation; holes and sampling.)


Figure  4:  Quantitative  evaluation  results  of  our  novel  3D  self-similarity  descriptor  with 
SHREC  benchmark  datasets. 


Table 1: Statistics of performance for SHREC datasets

Dataset    Points #            Feature #         Match #   Time (s)
Rotation   52,565 vs. 52,565   3,369 vs. 3,370   615       300.9
Scale      52,565 vs. 52,565   3,370 vs. 3,927   520       341.7
Affine     52,565 vs. 52,565   3,370 vs. 3,688   257       343.8
Hole       54,410 vs. 52,565   2,487 vs. 2,942   522       300.3
Pose       52,565 vs. 52,565   3,370 vs. 2,955   202       343.8
Topology   52,565 vs. 52,565   2,942 vs. 2,955   82        292.6

Figure  5:  3D  self-similarity  matching  of  aerial  LiDAR  point  clouds  (Vancouver  area). 


Figure  6:  Precision-recall  curves  of  matching  on  the  LiDAR  point  data  shown  in  Figure  5. 
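
For readers unfamiliar with how precision-recall curves of this kind are typically
produced, the sketch below sweeps an acceptance threshold over candidate matches and
reports precision and recall at each setting. It is a generic illustration, not the
evaluation protocol used for Figure 6: the function name, the use of a descriptor
distance as the match score, and the definition of recall as the fraction of correct
candidate matches that are accepted are assumptions.

```python
import numpy as np


def precision_recall(distances, is_correct, thresholds):
    """Return (precision, recall) pairs, one per acceptance threshold.

    distances  : descriptor distance of each candidate match (lower = better)
    is_correct : ground-truth flag for each candidate match
    """
    distances = np.asarray(distances, dtype=float)
    is_correct = np.asarray(is_correct, dtype=bool)
    total_correct = is_correct.sum()
    curve = []
    for t in thresholds:
        accepted = distances <= t
        true_pos = (accepted & is_correct).sum()
        # With nothing accepted there are no false positives, so precision is set to 1.
        precision = true_pos / accepted.sum() if accepted.any() else 1.0
        recall = true_pos / total_correct if total_correct else 0.0
        curve.append((float(precision), float(recall)))
    return curve
```

Raising the threshold accepts more matches, which generally increases recall and lowers
precision; plotting the pairs over a range of thresholds gives the curve.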


We developed an initial but complete matching system by combining the global primitive
shape matching with the local self-similarity descriptor matching. The system has a
focus-of-attention structure: it first detects primitive objects to form key backbones,
and then applies local self-similarity matching guided by the backbone objects. We
applied the system to the detection and recognition of commonly occurring objects in
urban environments such as lamp posts, fire hydrants, street lights, mailboxes,
chimneys, and vehicles. Figure 7 shows some sample results.


(Panels: (a) Common urban objects to be recognized; (b) Vehicles; (c) Mailbox;
(d) Street lights; (e) Pergola; (f) Traffic lights.)


Figure  7:  Applying  the  new  matching  approach  for  detection  and  recognition  of  commonly 
occurring  urban  objects  from  LiDAR  point  clouds  (LA  downtown  area). 


3.  Conclusion 


In conclusion, we believe that the issues raised by the earlier proposal reviews have
been addressed. The computation times are reasonable given the early stage of
algorithm development and the use of commodity PC computing systems.

We think that the feasibility of MBR has been shown. A combination of primitive
recognition and general shape recognition has been applied to general LiDAR data with
encouraging results. Robustness to noise and sparse data remains a difficult issue that
requires further work, but the results on real scan data indicate that MBR methods can
work directly on commercial-grade LiDAR without any special processing or considerations.

As for a specific research plan to pursue from here, we will elaborate on that in a
full proposal.